Formant and F0 Features for Speaker Verification
Authors
Abstract
In this paper, a feature set consisting of fundamental frequency, formant center frequencies, and formant bandwidths was used in speaker verification experiments on the database distributed by the Speaker Odyssey Workshop. The features were extracted using the Entropic Signal Processing System. The main classifier was a Gaussian Mixture Model system built by MIT Lincoln Laboratory, but tests were also run using a Vector Quantization classifier for comparison. Different normalization methods, including Hnorm and spectral subtraction, were applied in an attempt to improve results. Test results on the Speaker Odyssey database, and on the database used in the NIST 1998 Speaker Recognition Evaluation, are presented as Detection Error Tradeoff (DET) curves. Speaker verification accuracy did not improve using these frequency-based features, but the Equal Error Rate obtained with the small set of frequency-based features was within 10% of that obtained with the standard large set of mel-frequency cepstral coefficients.
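The abstract reports results as an Equal Error Rate, the operating point at which the miss rate and the false-alarm rate coincide. The following is a minimal sketch of how such a figure can be estimated from verification scores; it is not the evaluation code used in the paper. The function name `equal_error_rate` and the example scores are hypothetical, and the sketch assumes that higher scores indicate a more likely target speaker.

```python
import numpy as np

def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER from two score lists (illustrative helper, not from the paper).

    Assumes a trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the candidate threshold where the miss
    rate and the false-alarm rate are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.sort(np.concatenate([target, impostor]))

    # Miss rate: fraction of target trials rejected at each threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of impostor trials accepted at each threshold.
    false_alarm = np.array([(impostor >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    idx = int(np.argmin(np.abs(miss - false_alarm)))
    eer = (miss[idx] + false_alarm[idx]) / 2.0
    return eer, thresholds[idx]

if __name__ == "__main__":
    # Made-up scores purely for demonstration.
    eer, thr = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

Sweeping the same thresholds and plotting miss rate against false-alarm rate on normal-deviate axes gives the DET curve; the EER is the point where that curve crosses the diagonal.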
Similar resources
Forensic speaker verification using formant features and Gaussian mixture models
A new method for speaker verification based on formant features is presented. A UBM-GMM verification system is applied to semi-automatically extracted formant features. Speaker-specific vocal tract configurations, including the speakers’ variability, are incorporated in the speaker models. Speaker comparisons are expressed as likelihood ratios (the ratio of similarity to typicality). F1, F2 and ...
Effect of Gender on Improving Speech Recognition System
Speech is the output of a time-varying system excited by a time-varying excitation. It generates pulses with fundamental frequency F0. This time-varying impulse train is one of the features, characterized by the fundamental frequency F0 and its formant frequencies. These features vary from one speaker to another and also between genders. In this paper the effect of gender on improving...
Noise-robust speaker verification using F0 features
This paper proposes a noise-robust speaker verification method augmented by fundamental frequency (F0). The paper first describes a noise-robust F0 extraction method using the Hough transform. Then, it proposes a robust speaker verification method using multi-stream HMMs which fuse the extracted F0 and cepstral features. Experiments are conducted using four-connected-digit utterances of Japanese...
Comparison of spectrum estimators in speaker verification: mismatch conditions induced by vocal effort
We study the problem of vocal effort mismatch in speaker verification. Changes in a speaker’s vocal effort induce changes in fundamental frequency (F0) and formant structure, which introduce unwanted intra-speaker variations to features. We compare seven alternative spectrum estimators in the context of mel-frequency cepstral coefficient (MFCC) extraction for speaker verification. The compared vari...
Publication date: 2001